- Wednesday, July 31, 2024
OpenAI has begun rolling out an advanced voice mode for ChatGPT Plus subscribers, featuring four preset voices trained with voice actors. The feature, previously delayed for safety reasons, is expected to be available to all ChatGPT Plus users in the fall.
- Wednesday, September 25, 2024
OpenAI is rolling out Advanced Voice Mode (AVM) to ChatGPT Plus and Team users. Enterprise and Edu customers will start receiving access next week. The upgrade makes ChatGPT more natural to speak with and gives the feature a revamped design. It also adds five new voices and brings over some of ChatGPT's customization features, including Custom Instructions and Memory.
- Wednesday, July 31, 2024
OpenAI is launching a new AI voice chatbot powered by the GPT-4o model, capable of natural, fluent conversation. It is initially available to select ChatGPT Plus subscribers.
- Friday, September 27, 2024
OpenAI has introduced an upgraded feature called Advanced Voice Mode (AVM) for its ChatGPT application, aimed at enhancing the conversational experience for users. This new audio capability is being rolled out to subscribers of the Plus and Team tiers, with plans to extend access to Enterprise and Edu customers shortly thereafter. The update includes a fresh design, replacing the previous animated black dots with a blue animated sphere, which signifies the availability of AVM. As part of this rollout, users will be notified through a pop-up in the ChatGPT app when they can access the new voice feature. The update also introduces five additional voices—Arbor, Maple, Sol, Spruce, and Vale—bringing the total to nine. This selection of voice names is inspired by nature, reflecting the goal of making interactions with ChatGPT feel more organic and natural. Notably, the voice named Sky, which had drawn legal concerns from actress Scarlett Johansson, is not included in this release. While Advanced Voice Mode enhances the audio experience, some features from the original announcement, such as video and screen-sharing capabilities, are still pending. These features were designed to allow ChatGPT to process visual and auditory information simultaneously, but OpenAI has not provided a timeline for their release. Improvements have been made to AVM since its initial alpha testing, with claims of better accent recognition and smoother conversations. Additionally, OpenAI is expanding customization options within AVM, including features like Custom Instructions, which allow users to tailor ChatGPT's responses, and Memory, enabling the AI to remember past interactions for future reference. However, it is important to note that AVM is not yet available in several regions, including the EU, the U.K., and parts of Scandinavia.
Overall, the introduction of Advanced Voice Mode represents a significant step in making AI interactions more engaging and user-friendly, while also addressing previous concerns and limitations.
- Tuesday, March 5, 2024
OpenAI introduced a Read Aloud feature for ChatGPT, allowing it to verbally respond to users in 37 languages with five voice options and enhancing its multimodal capabilities and usability for on-the-go interactions across web and mobile platforms.
- Tuesday, May 14, 2024
OpenAI announced a new AI model yesterday called GPT-4o that can converse using speech in real time, read emotional cues, and respond to visual input. It will roll out to ChatGPT users for free over the next few weeks and as a service through the API. Paid subscribers will have five times the rate limits of free users. The API will feature twice the speed, 50% lower cost, and five times higher rate limits compared to GPT-4 Turbo. A 26-minute video introducing GPT-4o and demonstrating its abilities is available in the article.
- Friday, October 4, 2024
OpenAI recently launched its annual DevDay event in San Francisco, marking a significant shift in how the company engages with developers. This year's event introduced four major API updates designed to enhance the integration of OpenAI's AI models into various applications. Unlike the previous year, which featured a keynote by CEO Sam Altman, the 2024 DevDay adopted a more global approach, with additional events scheduled in London and Singapore. One of the standout features unveiled at the event is the Realtime API, now in public beta. This API allows for speech-to-speech conversations using six preset voices, simplifying the process of creating voice assistants. Previously, developers had to juggle multiple models for different tasks, but the Realtime API enables them to manage everything with a single API call. OpenAI also plans to enhance its Chat Completions API by adding audio input and output capabilities, allowing for more versatile interactions. In addition to the Realtime API, OpenAI introduced two new features aimed at helping developers optimize performance and reduce costs. The first, "model distillation," allows developers to fine-tune smaller, more affordable models using outputs from advanced models, potentially improving the relevance and accuracy of the results. The second feature, "prompt caching," speeds up the inference process by remembering frequently used prompts, offering significant cost savings and faster processing times. Another notable update is the expansion of fine-tuning capabilities to include images, referred to as "vision fine-tuning." This allows developers to customize the multimodal version of GPT-4o by incorporating both images and text, paving the way for advancements in visual search, object detection for autonomous vehicles, and medical image analysis. The absence of a keynote from Sam Altman this year was a notable change, especially given the dramatic events surrounding his leadership in the past year. 
Instead, the focus was placed on the technology and the product team. Altman did attend the event and participated in a closing "fireside chat," reflecting on the significant changes OpenAI has undergone since the last DevDay, including a drastic reduction in costs and a substantial increase in token volume. Overall, the 2024 DevDay emphasized OpenAI's commitment to empowering developers with new tools and capabilities while navigating the complexities of its recent organizational changes. The event showcased a clear direction towards enhancing AI applications and fostering innovation in the developer community.
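The "model distillation" workflow mentioned above amounts to capturing a larger model's outputs and formatting them as training examples for a smaller model. A minimal sketch, assuming the common chat-style JSONL training format; the helper name and file name are illustrative, not from OpenAI's announcement:

```python
import json

def build_distillation_file(pairs, path):
    """Write (prompt, teacher_response) pairs as JSONL chat records,
    the typical input for fine-tuning a smaller "student" model."""
    with open(path, "w") as f:
        for prompt, response in pairs:
            record = {"messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": response},
            ]}
            f.write(json.dumps(record) + "\n")

# Responses captured earlier from the larger "teacher" model.
pairs = [
    ("Summarize: the sky is blue.", "The sky appears blue."),
    ("Translate 'hello' to French.", "Bonjour."),
]
build_distillation_file(pairs, "distill.jsonl")
```

The resulting file is what a fine-tuning job would consume; prompt caching, by contrast, is a server-side optimization and needs no client-side preparation.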
- Tuesday, May 21, 2024
OpenAI paused the "Sky" voice in ChatGPT after backlash over its alleged resemblance to Scarlett Johansson's voice in the 2013 film Her. The actress is also pursuing legal action against the AI firm.
- Tuesday, June 4, 2024
AI is leading to a revolution in communication spurred by OpenAI's GPT-4o, which integrates audio, vision, and text in real time. This shift enables more natural interactions with AI, transforming human-to-AI communication into a central mode of digital interaction and potentially leading to significant societal changes and new startups focused on AI-centric communication.
- Tuesday, June 11, 2024
Apple announced at WWDC 2024 that it is integrating ChatGPT into a variety of native experiences on iOS devices. It plans to eventually expand to use voice and other capabilities.
- Monday, May 13, 2024
OpenAI has announced a live stream event scheduled for May 13 to present updates related to ChatGPT and GPT-4, possibly including the launch of an AI-powered search engine.
- Tuesday, April 2, 2024
OpenAI's Voice Engine is a model that generates speech mimicking a speaker's voice from a 15-second audio sample. It can be used in applications like educational aids, translation, and support for non-verbal individuals. OpenAI is employing a cautious approach to deployment due to potential misuse.
- Tuesday, July 9, 2024
OpenAI's ChatGPT has varied performance in code generation, with success rates ranging from less than 1% to 89% depending on factors like task difficulty and programming language.
- Friday, October 4, 2024
OpenAI has recently unveiled a new interface for ChatGPT called "Canvas," designed specifically for writing and coding projects. This innovative feature introduces a separate workspace that operates alongside the traditional chat window, allowing users to generate text or code directly within this canvas. Users can highlight portions of their work to request edits from the AI, enhancing the collaborative experience. The Canvas feature is currently in beta, available to ChatGPT Plus and Team users, with plans to extend access to Enterprise and Edu users shortly thereafter. The introduction of editable workspaces like Canvas reflects a broader trend among AI providers, who are increasingly focusing on creating practical tools for generative AI applications. This new interface is similar to offerings from competitors such as Anthropic’s Artifacts and the coding assistant Cursor. OpenAI aims to keep pace with these competitors while also expanding the capabilities of ChatGPT to attract more paid users. While current AI chatbots struggle to complete extensive projects from a single prompt, they can still provide valuable starting points. The Canvas interface allows users to refine the AI's output without needing to rework their initial prompts, making the process more efficient. Daniel Levine, a product manager at OpenAI, emphasized that this interface facilitates a more natural collaboration with ChatGPT. In a demonstration, Levine showcased how users can generate an email using ChatGPT, which then appears in the canvas window. Users can adjust the length of the email and request specific changes, such as making the tone friendlier or translating it into another language. The coding aspect of Canvas offers unique features as well. For instance, when generating code, users can request in-line documentation to clarify the code's functionality.
Additionally, a new "review code" button allows users to receive suggestions for code improvements, which they can approve or modify. OpenAI plans to make the Canvas feature available to free users once it exits the beta phase, further broadening access to this enhanced collaborative tool.
- Monday, May 27, 2024
GPT-4o, OpenAI's latest AI model, bridges real-time communication between humans and machines, extending capabilities beyond text to include vision and audio. The AI revolution introduces a new wave of human-to-AI and eventual AI-to-AI interactions, likely impacting the dynamics of our social behaviors and business models. As this technology progresses, its effect on human communication will unfold, potentially catalyzing the creation of innovative companies and software solutions.
- Tuesday, May 14, 2024
Apple is nearing an agreement with OpenAI to integrate ChatGPT technology into the iPhone, potentially featuring it in the upcoming iOS 18 as part of its AI enhancements.
- Thursday, June 20, 2024
OpenAI and Google have introduced advanced AI models that enable real-time multimodal understanding and responses, promising improved AI assistants and innovations in voice agents. OpenAI's GPT-4o boasts double the speed and half the cost of its predecessor, while Google's Gemini 1.5 Flash delivers a significant reduction in latency and cost. Both tech giants are integrating AI across their ecosystems, and OpenAI is eyeing consumer markets that could reach up to a billion users through its products and partnerships.
- Tuesday, August 20, 2024
Gemini Live, Google's AI-powered voice interaction system, mimics natural conversation but struggles with hallucinations and inaccuracies. Despite incorporating professional actors for more expressive voices, it lacks the customizability and expressiveness found in competitors like OpenAI's Advanced Voice Mode. Overall, the bot's reliability issues and limited functionality make its practicality and purpose unclear, particularly as part of Google's paid AI Premium Plan.
- Thursday, May 30, 2024
OpenAI has signed licensing deals with The Atlantic and Vox Media, allowing their content to train its AI models and be shared in ChatGPT with proper attribution.
- Tuesday, March 26, 2024
Character Voice is a suite of features that allows users to hear Characters speaking to them in 1:1 chats, taking the Character.AI experience to the next level. It is the first step in the company's larger plan to build a multimodal interface, which will facilitate more seamless, intuitive, and engaging interactions.
- Wednesday, May 15, 2024
GPT-4o's multimodal abilities, integrating vision and voice, promise significant advances in how AI interacts with the world, paving the way for AI to become a more ubiquitous presence in daily life.
- Wednesday, April 24, 2024
In this article, OpenAI's Evan Morikawa provides insights into ChatGPT's inner workings, from input text processing and tokenization to prediction sampling using large language models. ChatGPT operates by turning tokens into numerical vectors (embeddings), multiplying them by a weight matrix with billions of parameters, and selecting the most probable next word. The tech is grounded in extensive pretraining to predict text based on vast internet data.
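The pipeline Morikawa describes (tokenize, embed, multiply by a weight matrix, pick the likeliest next word) can be illustrated with a toy next-token predictor. The tiny vocabulary, random weights, and mean-pooled context below are stand-ins for the real transformer, not OpenAI's implementation:

```python
import math
import random

random.seed(0)
vocab = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary
DIM = 8
# Token -> 8-dim embedding vector, and an 8 x vocab output matrix.
embed = {t: [random.gauss(0, 1) for _ in range(DIM)] for t in vocab}
W_out = [[random.gauss(0, 1) for _ in vocab] for _ in range(DIM)]

def next_token(tokens):
    # Pool the context embeddings into one vector (a crude stand-in
    # for the transformer layers).
    hidden = [sum(embed[t][d] for t in tokens) / len(tokens)
              for d in range(DIM)]
    # Project to vocabulary logits.
    logits = [sum(hidden[d] * W_out[d][v] for d in range(DIM))
              for v in range(len(vocab))]
    # Softmax, then greedily pick the most probable next token.
    exps = [math.exp(l) for l in logits]
    probs = [e / sum(exps) for e in exps]
    return vocab[probs.index(max(probs))]

print(next_token(["the", "cat"]))
```

Real models replace the mean-pooling with stacked attention layers and sample from the probability distribution rather than always taking the argmax.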
- Friday, March 15, 2024
OpenAI has announced partnerships with Le Monde and Prisa Media involving the integration of their content into ChatGPT to provide users with interactive and insightful access to news and to aid in model training.
- Monday, April 22, 2024
This article discusses the transformative potential and current limitations of generative AI like ChatGPT, noting that while it excels in tasks like coding and generating drafts, it struggles with complex tasks that require specific programming. It highlights the need for a vision that matches AI solutions with practical applications, emphasizing that identifying and integrating these into daily workflows remains a significant challenge.
- Monday, May 27, 2024
Scarlett Johansson alleges that OpenAI created a voice for ChatGPT that mimics hers without consent, leading her to seek legal advice. OpenAI has since paused using the voice and is discussing the issue with her representatives. The situation highlights concerns over the use of celebrity likenesses in AI applications.
- Thursday, March 21, 2024
All future versions of Redis will be released under source-available licenses, starting with Redis 7.4. Redis will no longer be distributed under the three-clause BSD license and will instead be dual-licensed under the Redis Source Available License and the Server Side Public License. The new licenses enable Redis to sustainably provide permissive use of its source code, allowing it to remain freely available to developers, customers, and partners through Redis Community Edition as Redis moves into its next phase of development as a real-time data platform with a unified set of clients, tools, and core product offerings.
- Thursday, March 21, 2024
Neuralink has shared a video of its brain-computer interface in action inside a human patient. The patient, a 29-year-old man named Nolan Arbaugh who is paralyzed from the neck down, is able to control a cursor on a screen using Neuralink's device. He is able to play games online for about eight hours before the implant needs to recharge. Arbaugh reports that his experience with the implant has so far been positive, despite some initial issues. The video is available in the article.